Biology-inspired graph neural network encodes reactome and reveals biochemical reactions of disease
Authors
Abstract
• Biochemical reaction states are approximated with PCA-transformed RNA-seq count data
• Our graph neural network outperforms both random rewired and deep learning controls
• Analysis of our model reveals reactions associated with psoriasis in prior studies
• Traditional analysis approaches fail to discover the reactions found with our method

The Human Genome Project unlocked the door to a vast but unannotated collection of genes. In the following decades, annotations in the form of biochemical graphs were painstakingly curated via experimental studies. Though gene set enrichment analysis considers groups of genes within these annotation graphs, it disregards group dependencies. Here, we utilize these dependencies by generating a graph neural network based on the Reactome graph and show how integrating relationships from this graph with gene expression values from other studies can be used to identify tissue-specific reactions of disease. In the future, similar approaches could enable fruitful reanalyses of prior work, highlighting influential genes and pinpointing reactions. As more research databases become available, we envision extensions of this work predicting the effects of rare or indistinct genetic variations and guiding precision medicine.

Functional heterogeneity of healthy human tissues complicates the interpretation of molecular studies, impeding therapeutic target identification and treatment. Considering this, we generated a graph neural network with a Reactome-based architecture and trained it using 9,115 samples from the Genotype-Tissue Expression (GTEx) project. Our graph neural network (GNN) achieves an adjusted Rand index (ARI) = 0.7909, while a Resnet18 control model achieves ARI = 0.7781, on 370 held-out tissue samples from The Cancer Genome Atlas (TCGA), despite the Resnet18 model having over 600 times more parameters. Our GNN also succeeds in separating 83 healthy skin samples from 95 lesional psoriasis samples, revealing that upregulation of 26S- and NUB1-mediated degradation of NEDD8, UBD, and their conjugates is central to the largest perturbed reaction network component in psoriasis. We show that our results are not discoverable using traditional differential expression and hypergeometric pathway enrichment analyses, yet they are supported by separate human multi-omics and small-molecule mouse studies, suggesting that future molecular studies of disease may benefit from similar GNN analytical approaches.

Across tissues, mRNA abundance varies,1Melé M. Ferreira P.G. Reverter F. DeLuca D.S. Monlong J. 
Sammeth Young T.R. Goldmann J.M. Pervouchine D.D. Sullivan T.J. Johnson R. genomics. transcriptome across individuals.Science. 2015; 348: 660-665,2Suntsova Gaifullin N. Allina D. Reshetun A. Li X. Mendeleeva L. Surin V. Sergeeva Spirin P. Prassolov et al.Atlas RNA sequencing profiles for normal tissues.Sci. Data. 2019; 6: 36-39,3Wang Eraslan B. Wieland T. Hallström Hopf Zolg D.P. Zecha Asplund L.H. Meng C. al.A proteome atlas 29 tissues.Mol. Syst. Biol. 2019 Feb; 15e8503 consequent protein likewise observed,4Wilhelm Schlegl Hahne H. Gholami A.M. Lieberenz Savitski M.M. Ziegler E. Butzmann Gessulat S. Marx al.Mass-spectrometry-based draft proteome.Nature. 2014; 509: 582-587,5Aebersold Mann Mass-spectrometric exploration structure function.Nature. 2016; 537: 347-355 affecting which what degree take place.6Blagoev Kratchmarova I. Ong S.E. Nielsen Foster L.J. A proteomics strategy elucidate functional protein-protein interactions applied EGF signaling.Nat. Biotechnol. 2003; 21: 315-318,7Meisinger Sickmann Pfanner mitochondrial proteome: inventory function.Cell. 2008; 134: 22-24,8Lundberg Fagerberg Klevebring Matic Geiger Cox Algenäs Lundeberg Uhlen Defining three functionally different cell lines.Mol. 2010; 450 While understood harbor characteristic patterns,9Lonsdale Thomas Salvatore Phillips Lo Shad Hasz Walters G. Garcia al.The genotype-tissue (GTEx) project.Nat. Genet. 2013; 45: 580-585 products exhibit complex cellular behaviors as result modification, variation abundances, compartment morphology, and, presumably, phenomena preclude straightforward extrapolation carrying out functions.10Krishna R.G. Wold Post-translational modifications proteins.Methods sequence analysis. 1993; : 167-172,11Mann Jensen O.N. 
Proteomic post-translational modifications.Nat. 255-261,12Minton A.P. How cells differ those test tubes?.J. Cell Sci. 2006; 119: 2863-2869,13Chen B.J. Lam T.C. Liu L.Q. To C.H. applications eye research.Mol. Med. Rep. 2017 1; 15: 3923-3935,14Ramazi Zahiri proteins: resources, tools prediction methods.Database. 2021; 2021: baab012,15Li W. Zhang Lin H.K. Xu Insights into modification its emerging role shaping tumor microenvironment.Signal Transduct. Targeted Ther. 422-430 Despite considerable advances contemporary omics-based biological complexity remains largely hidden us. Achieving high confidence regarding likely occur assume particular contexts paramount order understand mechanisms development functions etiology Using sequencing,16Mortazavi Williams B.A. McCue K. Schaeffer Mapping quantifying mammalian transcriptomes RNA-Seq.Nat. Methods. 5: 621-628 experimentalists able approximate present cells. many involve mRNA, specific synthesis required synthesis, thus patterns protein-regulated phenotypes states. Total positively correlated (R2 0.41 log-log scale, R2 0.44 nonlinear transformation17Schwanhäusser Busse Dittmar Schuchhardt Wolf Chen Selbach Global quantification control.Nature. 2011; 473: 337-342), when steady-state conditions met, has been shown primarily determined abundance,18Liu Y. Beyer Aebersold On dependency levels abundance.Cell. 165: 535-550 between 56% 84% explained alone.19Li J.J. Bickel P.J. Biggin M.D. System wide have underestimated abundances importance transcription mammals.PeerJ. 2 (a)e270 any occur, necessary reactants must present; however, there direct correspondence protein’s measured availability participate due competitive occupancy among binding partners modifications.10Krishna,20Lash Putt D.A. 
Cai Drug metabolism enzyme activity primary cultures proximal tubular cells.Toxicology. 244: 56-65,21Jens Rajewsky Competition sites regulators shapes post-transcriptional regulation.Nat. Rev. 16: 113-126 Furthermore, some components, such small molecules, interfering (siRNA), metal ions, organic inorganic compounds, currently unquantified manner remain unavailable consideration cross-tissue analyses. However, mediated enzymes resulting translation influence rate, act proxies states.22Rieder M.J. Carmona Krieger J.E. Pritchard Jr., K.A. Greene A.S. Suppression angiotensin-converting shear stress.Circ. Res. 1997; 80: 312-319,23Wassmann Wassmann Nickenig Modulation oxidant antioxidant function vascular cells.Hypertension. 2004; 44: 381-386,24Johnston J.A. W.W. Todd S.A. Coulson Murphy Irvine G.B. Passmore β-site amyloid precursor cleaving Alzheimer's disease.Biochem. Soc. Trans. 2005; 33: 1096-1100,25García-López Häkkinen Cuevas Lima Kauhanen Mattila Sillanpää Ahtiainen J.P. Karavirta Almar González-Gallego Effects strength endurance training middle-aged men.Scand. Sports. 2007; 17: 595-604,26Mauriz J.L. Molpeceres García-Mediavilla M.V. González Barrio Melatonin prevents oxidative stress changes liver aging rats.J. Pineal 42: 222-230,27Ronis M.J.J. soy containing diet isoflavones cytochrome P450 activity.Drug Metab. 48: 331-341,28Xie Yan Gong Zhu Z. Tan Hu Peng substrates lignocellulosic expression, activity, substrate utilization efficiency Pleurotus eryngii.Cell. Physiol. Biochem. 39: 1479-1494,29Matsumoto Matsuzaki Oshikawa Goshima Mori Kawamura Ogawa Fukuda Nakatsumi Natsume large-scale targeted assay resource an vitro proteome.Nat. 2017; 14: 251-258,30Dammann Stapelfeld Maser cortisol-activating 11β-hydroxysteroid dehydrogenase type 1 species-specific.Chem. 
Interact. 303: 57-61 Thus, metrics inferred identifying repeated multiple transcripts coding proteins Such transcript associate imply abundance17Schwanhäusser,18Liu,19Li and, by syllogism, the tissues. Within tissue, components react each network.31Stelzl U. Worm Lalowski Haenig Brembeck F.H. Goehler Stroedicke Zenkner Schoenherr Koeppen interaction network: annotating proteome.Cell. 122: 957-968,32Price N.D. Shmulevich Biochemical statistical models systems biology.Curr. Opin. 18: 365-370,33Bossi Lehner Tissue specificity network.Mol. 2009; 260,34Wu Feng Stein application cancer data analysis.Genome 11 (R53-R23),35Wu Haw construction discovery.in: Protein bioinformatics. Humana Press, New York, NY, 2017: 235-253 Tissues plausibly characterized combined networks alone. characterize leveraged paradigm, demonstrated scale large graphs.36Hamilton Ying Leskovec Inductive representation graphs.in: Advances Neural Information Processing Systems. 30. 2017 (GTEx),9Lonsdale source (RNA-seq) Reactome,37Gillespie Jassal Stephan Milacic Rothfels Senff-Ribeiro Griss Sevilla Matthews reactome knowledgebase 2022.Nucleic Acids 2022; 50: D687-D692 most comprehensive database. database unique repertoire Each annotated approved experts traceable literature. reaction’s state constituent participants’ applying principal-component (PCA)38Pearson LIII. lines planes closest fit points space.London, Edinburgh Dublin Phil. Mag. 1901; 2: 559-572 sets conceptually simple, computationally efficient, outcome-naive aggregation (Figure 1). then created architecture39Morris Ritzert Fey Hamilton W.L. Lenssen Rattan Grohe Weisfeiler leman go neural: higher-order networks.in: Proceedings AAAI Conference Artificial Intelligence. 33. 2019: 4602-4609 classify 51 types transformed GTEx 7), embedding weights model. 
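The PCA-based aggregation mentioned above (one state value per reaction and sample, taken from the first principal component of the participating genes' counts) can be illustrated with a minimal sketch. The source text names prcomp() from the R stats package; the pure-Python power iteration below is only a stand-in for it. The function names, the toy gene and reaction identifiers, and the choice to center without scaling are assumptions made for illustration.

```python
import math
import random


def first_pc_scores(rows, n_iter=200, seed=0):
    """Return the PC1 score of every sample in a samples x genes matrix.

    Columns are centered but not scaled (an assumption of this sketch).
    The sign of a principal component is arbitrary.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]

    # Random unit-length starting direction for the power iteration.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(n_iter):
        # w = X^T (X v), which is proportional to (covariance matrix) * v.
        xv = [sum(c[j] * v[j] for j in range(p)) for c in centered]
        w = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along v: keep the current direction
            break
        v = [x / norm for x in w]

    # Project each centered sample onto the leading direction.
    return [sum(c[j] * v[j] for j in range(p)) for c in centered]


def reaction_states(counts, reaction_genes):
    """Map each reaction to per-sample PC1 scores of its participant genes.

    counts: hypothetical dict, gene -> list of per-sample values.
    reaction_genes: hypothetical dict, reaction -> list of genes; a gene
    may take part in several reactions (many-to-many grouping).
    """
    states = {}
    for reaction, genes in reaction_genes.items():
        n_samples = len(counts[genes[0]])
        rows = [[counts[g][i] for g in genes] for i in range(n_samples)]
        states[reaction] = first_pc_scores(rows)
    return states


if __name__ == "__main__":
    toy_counts = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [5, 5, 5, 5]}
    toy_reactions = {"R1": ["g1", "g2"], "R2": ["g2", "g3"]}
    for name, scores in reaction_states(toy_counts, toy_reactions).items():
        print(name, [round(s, 3) for s in scores])
```

Stacking the per-reaction score lists gives a reaction-by-sample matrix, which is the kind of input a classifier over reactions would consume.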
This critical overall workflow obtaining biologically significant because specifies interdependency. By including graph, consider information interdependent extract after training. contrast, architectures representative systems. validated transfer approach (results) where classifies (TCGA)40Weinstein J.N. Collisson E.A. Mills Shaw K.R.M. Ozenberger Ellrott Sander Stuart Research NetworkThe genome pan-cancer 1113-1120 well conventional model41He Ren Sun Deep residual image recognition.in: IEEE conference computer vision pattern recognition. 2016: 770-778 same data, than study comparing samples,42Li Tsoi L.C. Swindell W.R. Gudjonsson Tejasvi Johnston Ding P.E. Xing Kochkodan al.Transcriptome case–control sample: provides insights mechanisms.J. Invest. Dermatol. 1828-1838 recovers promote integrative genomics study43Zhao Jhamb Shu Arneson Rajpal D.K. Yang Multi-omics integration psoriasis.BMC 13 (8-4) model,44Huang Gao Fan Wang Zhong H.Y. Cao Q. Zhou al.CRL4DCAF2 negatively regulates IL-23 production dendritic limits psoriasis.J. Exp. 2018; 215: 1999-2017 analyses.
Figure 7. Reaction decomposition input. (A) hierarchy network. (B) network. (C) architecture. (D) 1-GNN GraphConv() aggregator function, passed through layers size 64 types.
TCGA two largest-scale conducted preceding decade; made publicly available commonly subject reanalyses. studies’ others hosted Sequence Read Archive (SRA) reprocessed uniform way single pipeline Recount2 project,45Collado-Torres Nellore Kammers Ellis Taub M.A. Hansen K.D. Jaffe A.E. Langmead Leek J.T. Reproducible recount2.Nat. 35: 319-321 sample phenotype online portal at https://jhubiostatistics.shinyapps.io/recount/. opted use Recount(-ed) train represented (9,115) reserved downstream validation. calculated reaction-specific graphs. 
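The abstract compares the GNN and the Resnet18 control by adjusted Rand index (ARI) on held-out TCGA samples. As a reminder of what that score measures, here is a minimal sketch of the standard pair-counting ARI formula. It is not the authors' evaluation code; the function name and the toy labels are assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples.

    1.0 means identical partitions (label names do not matter); values
    near 0.0 are what random assignment would be expected to give.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: partitions trivially agree

    # Contingency counts and their row/column marginals.
    joint = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs
    maximum = (sum_rows + sum_cols) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster on both sides
    return (sum_joint - expected) / (maximum - expected)


if __name__ == "__main__":
    tissue = ["skin", "skin", "lung", "lung"]      # toy ground truth
    predicted = ["a", "a", "b", "c"]               # toy model output
    print(round(adjusted_rand_index(tissue, predicted), 4))
```

Because the score is corrected for chance and ignores label names, it can compare models whose predicted class identifiers differ from the reference tissue labels.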
downloaded grouped many-to-many fashion according participated. counts ranged 5 (endocervix) 475 (skeletal muscle), indicated Table 1. considered several methods reduce dimensionality t-distributed stochastic neighbor (t-SNE),46Van der Maaten Hinton Visualizing t-SNE.J. Mach. Learn. 9 manifold approximation projection (UMAP),47McInnes Healy Melville Umap: dimension reduction.arXiv. (Preprint at) https://doi.org/10.48550/arXiv.1802.03426 potential heat diffusion affinity-based transition (PHATE).48Moon K.R. van Dijk Gigante Burkhardt D.B. W.S. Yim Elzen A.v.d. Hirn Coifman R.R. al.Visualizing transitions high-dimensional data.Nat. 37: 1482-1492 PCA was selected concerns performance simplicity. Reactionwise prcomp() R stats package,49R Core TeamR: Language Environment Statistical Computing. Foundation Computing, Vienna, Austria 2022 https://www.R-project.org/ objects stored. distribution proportion variance first ten principal all plotted 2A, median explains 50% variance, second 25% subsequent tend explain less would expected. value recorded sample, forming samplewise PC1 matrix. routine matrix representing 6,323 genes 10,726 samples. Reaction (multiple reactions) 214 “olfactory receptor-G olfactory trimer formation” (Reactome:R-HSA-381750), log reaction-log 2B.

Table 1. Tissue dataset
GTEx label                                Number of samples
Adipose – subcutaneous                    386
Adipose – visceral (omentum)              234
Adrenal gland                             159
Artery – aorta                            247
Artery – coronary                         140
Artery – tibial                           363
Bladder                                   11
Brain – amygdala                          81
Brain – Ant. cin. cortex (BA24)           99
Brain – caudate (basal ganglia)           134
Brain – cerebellar hemisphere             118
Brain – cerebellum                        145
Brain – cortex                            132
Brain – frontal cortex (BA9)              120
Brain – hippocampus                       103
Brain – hypothalamus                      104
Brain – Nuc. acc. (basal ganglia)         123
Brain – putamen (basal ganglia)           103
Brain – spinal cord (cervical c-1)        76
Brain – substantia nigra                  71
Breast – mammary tissue                   218
Cervix – ectocervix                       6
Cervix – endocervix                       5
Colon – sigmoid                           173
Colon – transverse                        203
Esophagus – gastro. junction              176
Esophagus – mucosa                        331
Esophagus – muscularis                    283
Fallopian tube                            7
Heart – atrial appendage                  218
Heart – left ventricle                    271
Kidney – cortex                           36
Liver                                     136
Lung                                      374
Minor salivary gland                      70
Muscle – skeletal                         475
Nerve – tibial                            335
Ovary                                     108
Pancreas                                  197
Pituitary                                 124
Prostate                                  119
Skin – not sun exposed (suprapubic)       271
Skin – sun exposed (lower leg)            397
Small intestine – terminal ileum          104
Spleen                                    118
Stomach                                   204
Testis                                    203
Thyroid                                   361
Uterus                                    90
Vagina                                    97
Whole blood                               456
GTEx labels procedure.

common assess perform hierarchical clustering, reveal considering similarity. determine whether calculating reactionwise summarization significantly degraded otherwise influenced structure, Euclidean distance performed agglomerative clustering Ward’s D50Ward J.H. Hierarchical grouping optimize objective function.J. Am. Stat. Assoc. 1963; 58: 236-244,51Murtagh Legendre method: algorithms implement criterion?.J. Classif. 31: 274-295 coordinates compared dendrograms. Cophenetic correlation52Sokal Rohlf F.J. comparison dendrograms methods.Taxon. 1962; 11: 33-40,53Sneath P.H. Taxonomic structure.Numerical taxonomy. 1973; 230-234 calculate similarity distance. cophenetic correlation coefficient 0.9248 2C) permuting dendrogram specified documentation cor_cophenetic() dendextend package54Galili dendextend: package visualizing, adjusting trees clustering.Bioinformatics. 3718-3720 10,000 (p <1E−5). demonstrates coordinate similarly counts, justifying feature transformation. advantage expressing values, rather distinct formed transformation offers opportunity view lens upper bound Resnet18,41He recommended default selection.55Pointer Programming PyTorch Learning: Creating Deploying Learning Applications. O'Reilly Media, Inc., 2019 developed Microsoft clas
Similar resources
Scour modeling piles of Kambuzia industrial city bridge using HEC-RAS and artificial neural network
Today, scouring is one of the important topics in river and coastal engineering, since most bridge destruction is caused by this phenomenon. Bridges are among the most important connecting structures in the country's road network, and their importance doubles during floods, so their exact design and maintenance are crucial. f...
Systems biology. Amoeba-inspired network design.
17. J. A. Burns, D. P. Hamilton, F. Mignard, S. Soter, in Physics, Chemistry, and Dynamics of Interplanetary Dust, ASP Conference Series 104, B. Å. S. Gustafson, M. S. Hanner, Eds. (Astronomical Society of the Pacific, San Francisco, 1996), pp. 179–182. 18. B. J. Buratti, M. D. Hicks, K. A. Tryka, M. S. Sittig, R. L. Newburn, Icarus 155, 375 (2002). 19. F. Tosi et al., preprint available at htt...
Reactome: a database of reactions, pathways and biological processes
Reactome (http://www.reactome.org) is a collaboration among groups at the Ontario Institute for Cancer Research, Cold Spring Harbor Laboratory, New York University School of Medicine and The European Bioinformatics Institute, to develop an open source curated bioinformatics database of human pathways and reactions. Recently, we developed a new web site with improved tools for pathway browsing a...
Reduction of dynamical biochemical reactions networks in computational biology
Biochemical networks are used in computational biology, to model mechanistic details of systems involved in cell signaling, metabolism, and regulation of gene expression. Parametric and structural uncertainty, as well as combinatorial explosion are strong obstacles against analyzing the dynamics of large models of this type. Multiscaleness, an important property of these networks, can be used t...
Quantum-Inspired Neural Network with Sequence Input
To enhance the approximation and generalization ability of artificial neural network (ANN) by employing the principles of quantum rotation gate and controlled-not gate, a quantum-inspired neuron with sequence input is proposed. In the proposed model, the discrete sequence input is represented by the qubits, which, as the control qubits of the controlled-not gate after being rotated by the quant...
Journal
Journal title: Patterns
Year: 2023
ISSN: ['2666-3899']
DOI: https://doi.org/10.1016/j.patter.2023.100758